105 research outputs found
Hierarchical Metric Learning for Optical Remote Sensing Scene Categorization
We address the problem of scene classification from optical remote sensing
(RS) images based on the paradigm of hierarchical metric learning. Ideally,
supervised metric learning strategies learn a projection from a set of training
data points so as to minimize intra-class variance while maximizing inter-class
separability to the class label space. However, standard metric learning
techniques do not incorporate the class interaction information in learning the
transformation matrix, which is often considered to be a bottleneck while
dealing with fine-grained visual categories. As a remedy, we propose to
organize the classes in a hierarchical fashion by exploring their visual
similarities and subsequently learn separate distance metric transformations
for the classes present at the non-leaf nodes of the tree. We employ an
iterative max-margin clustering strategy to obtain the hierarchical
organization of the classes. Experimental results obtained on the large-scale
NWPU-RESISC45 and the popular UC-Merced datasets demonstrate the efficacy of
the proposed hierarchical metric learning based RS scene recognition strategy
in comparison to the standard approaches.
Comment: Undergoing revision in GRS
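The intra-class versus inter-class trade-off that this abstract describes can be illustrated with a small numeric sketch. This is not the authors' hierarchical method: the function name `scatter_traces`, the toy data, and the hand-picked projection matrices are all assumptions made for illustration, and no metric is actually learned here.

```python
import numpy as np

def scatter_traces(X, y, L):
    """Within-class and between-class scatter (traces) after projecting X by L.

    X: (n_samples, d) features, y: (n_samples,) integer labels,
    L: (k, d) linear projection. Hypothetical helper, not from the paper.
    """
    Z = X @ L.T                      # projected points, shape (n_samples, k)
    mu = Z.mean(axis=0)              # global mean in the projected space
    within = 0.0
    between = 0.0
    for c in np.unique(y):
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)         # class mean in the projected space
        within += ((Zc - mc) ** 2).sum()
        between += len(Zc) * ((mc - mu) ** 2).sum()
    return within, between

# Toy data: the classes differ along axis 0; axis 1 only adds intra-class spread.
X = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
y = np.array([0, 0, 1, 1])

within_id, between_id = scatter_traces(X, y, np.eye(2))                  # no projection
within_proj, between_proj = scatter_traces(X, y, np.array([[1.0, 0.0]]))  # keep axis 0 only
```

In this toy case, projecting onto the first axis removes all within-class scatter while leaving the between-class scatter unchanged, which is the kind of transformation a supervised metric learner aims to find.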
Information Seeking Behavior of Research Scholars of Vidyasagar University, West Bengal
The main objective of the study is to investigate the information seeking behavior of the research scholars of Vidyasagar University (VU), West Bengal. The study also aims to identify their information needs and their awareness of the library services provided by the university's central library. Data were collected from 100 researchers of the university through a structured questionnaire. Findings indicate that guidance in the use of library resources and services is necessary to help researchers meet some of their information requirements. Most of the researchers visit the library weekly (30%) or monthly (45%) and spend a maximum of 0-2 hours there, mainly to prepare research, study journals, and keep their knowledge up to date. Most of the researchers depend on the VU central library (27%) for information seeking and also collect information from other sources.
CMIR-NET : A Deep Learning Based Model For Cross-Modal Retrieval In Remote Sensing
We address the problem of cross-modal information retrieval in the domain of
remote sensing. In particular, we are interested in two application scenarios:
i) cross-modal retrieval between panchromatic (PAN) and multi-spectral imagery,
and ii) multi-label image retrieval between very high resolution (VHR) images
and speech based label annotations. Notice that these multi-modal retrieval
scenarios are more challenging than the traditional uni-modal retrieval
approaches given the inherent differences in distributions between the
modalities. However, with the growing availability of multi-source remote
sensing data and the scarcity of enough semantic annotations, the task of
multi-modal retrieval has recently become extremely important. In this regard,
we propose a novel deep neural network based architecture which is designed
to learn a discriminative shared feature space for all the input modalities,
suitable for semantically coherent information retrieval. Extensive experiments
are carried out on the benchmark large-scale PAN - multi-spectral DSRSID
dataset and the multi-label UC-Merced dataset. Together with the Merced
dataset, we generate a corpus of speech signals corresponding to the labels.
Superior performance with respect to the current state-of-the-art is observed
in all the cases.
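Once two modalities are embedded in a shared feature space, retrieval reduces to nearest-neighbour search in that space. The sketch below assumes cosine similarity as the ranking score and uses made-up 2-D embeddings; the names `l2_normalize`, `retrieve`, `pan_queries`, and `ms_gallery` are hypothetical and do not come from CMIR-NET, whose network and losses are not reproduced here.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (eps guards against division by zero)."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def retrieve(query_emb, gallery_emb, k=2):
    """Return the indices of the k most similar gallery items for each query.

    Both inputs are assumed to already live in the same shared embedding space.
    """
    sims = l2_normalize(query_emb) @ l2_normalize(gallery_emb).T  # cosine similarities
    return np.argsort(-sims, axis=1)[:, :k]                       # highest similarity first

# Made-up embeddings standing in for PAN queries and a multi-spectral gallery.
pan_queries = np.array([[1.0, 0.1], [0.0, 1.0]])
ms_gallery = np.array([[0.9, 0.1], [0.1, 0.8], [-1.0, -0.2]])

top = retrieve(pan_queries, ms_gallery, k=2)
```

Each row of `top` lists gallery indices from most to least similar for the corresponding query.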
Prototypical quadruplet for few-shot class incremental learning
Many modern computer vision algorithms suffer from two major bottlenecks:
scarcity of data and learning new tasks incrementally. While training the model
with new batches of data, the model loses its ability to classify the previous
data judiciously, which is termed catastrophic forgetting. Conventional
methods have tried to mitigate catastrophic forgetting of the previously
learned data while the training at the current session has been compromised.
The state-of-the-art generative replay based approaches use complicated
structures such as generative adversarial network (GAN) to deal with
catastrophic forgetting. Additionally, training a GAN with few samples may lead
to instability. In this work, we present a novel method to deal with these two
major hurdles. Our method identifies a better embedding space with an improved
contrasting loss to make classification more robust. Moreover, our approach is
able to retain previously acquired knowledge in the embedding space even when
trained with new classes. We update previous session class prototypes while
training in such a way that they are able to represent the true class means. This
is of prime importance as our classification rule is based on the nearest class
mean classification strategy. We have demonstrated our results by showing that
the embedding space remains intact after training the model with new classes.
We showed that our method performed better than the existing state-of-the-art
algorithms in terms of accuracy across different sessions.
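The nearest class mean rule mentioned in this abstract can be sketched in a few lines. The example assumes Euclidean distance in the embedding space and toy 2-D embeddings; the helper names `class_prototypes` and `ncm_predict` are hypothetical, and the paper's quadruplet loss and prototype update scheme are not reproduced.

```python
import numpy as np

def class_prototypes(embeddings, labels):
    """Map each class label to the mean of its embeddings (its prototype)."""
    return {int(c): embeddings[labels == c].mean(axis=0) for c in np.unique(labels)}

def ncm_predict(queries, prototypes):
    """Assign each query to the class with the nearest prototype (Euclidean distance assumed)."""
    classes = np.array(sorted(prototypes))
    protos = np.stack([prototypes[c] for c in classes])
    # Pairwise distances between queries and prototypes: shape (n_queries, n_classes)
    dists = np.linalg.norm(queries[:, None, :] - protos[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Toy embeddings for two classes.
emb = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]])
lab = np.array([0, 0, 1, 1])

protos = class_prototypes(emb, lab)
preds = ncm_predict(np.array([[0.1, 0.1], [4.9, 5.1]]), protos)
```

Because prediction depends only on the stored prototypes, a classifier of this kind stays accurate on old classes only as long as each prototype remains close to the true class mean, which is the point the abstract emphasises.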
GOPro: Generate and Optimize Prompts in CLIP using Self-Supervised Learning
Large-scale foundation models, such as CLIP, have demonstrated remarkable
success in visual recognition tasks by embedding images in a semantically rich
space. Self-supervised learning (SSL) has also shown promise in improving
visual recognition by learning invariant features. However, the combination of
CLIP with SSL is found to face challenges due to the multi-task framework that
blends CLIP's contrastive loss and SSL's loss, including difficulties with loss
weighting and inconsistency among different views of images in CLIP's output
space. To overcome these challenges, we propose a prompt learning-based model
called GOPro, which is a unified framework that ensures similarity between
various augmented views of input images in a shared image-text embedding space,
using a pair of learnable image and text projectors atop CLIP, to promote
invariance and generalizability. To automatically learn such prompts, we
leverage the visual content and style primitives extracted from pre-trained
CLIP and adapt them to the target task. In addition to CLIP's cross-domain
contrastive loss, we introduce a visual contrastive loss and a novel prompt
consistency loss, considering the different views of the images. GOPro is
trained end-to-end on all three loss objectives, combining the strengths of
CLIP and SSL in a principled manner. Empirical evaluations demonstrate that
GOPro outperforms the state-of-the-art prompting techniques on three
challenging domain generalization tasks across multiple benchmarks by a
significant margin. Our code is available at
https://github.com/mainaksingha01/GOPro.
Comment: Accepted at BMVC 202
Pharmacognostical, physiochemical and phytochemical evaluation of leaf, stem and root of orchid Dendrobium ochreatum
The present work aimed to carry out pharmacognostical and phytochemical evaluation of the individual roots, stems, and leaves of the orchid Dendrobium ochreatum. The plant was sun dried and ground to a fine powder using a mechanical grinder, followed by sieving. The fine powder was collected and subjected to different pharmacognostical studies such as fluorescence analysis under UV light at different wavelengths. Physiochemical parameters of the dried plant parts, such as ash values and loss on drying, were also evaluated. Each plant part (root, stem, and leaves) was separated and subjected to Soxhlet extraction using solvents of different polarity, i.e. hexane, chloroform, and ethanol, in a gradient elution technique. In total, nine plant extracts were obtained. Phytochemical screening revealed the presence of important phytoconstituents such as alkaloids, glycosides, and saponins, whereas only the chloroform extract of the stem exhibited the presence of steroids/phytosterols.
CognitiveCNN: Mimicking Human Cognitive Models to resolve Texture-Shape Bias
Recent works demonstrate the texture bias in Convolutional Neural Networks
(CNNs), conflicting with early works claiming that networks identify objects
using shape. It is commonly believed that the cost function forces the network
to take a greedy route to increase accuracy using texture, failing to explore
any global statistics. We propose a novel intuitive architecture, namely
CognitiveCNN, inspired by feature integration theory in psychology, to utilise
human-interpretable features such as shape, texture, and edges to reconstruct and
classify the image. We define two metrics, namely TIC and RIC, to quantify the
importance of each stream using attention maps. We introduce a regulariser
which ensures that the contribution of each feature is the same for any task as it
is for reconstruction, and perform experiments to show the resulting boost in
accuracy and robustness besides imparting explainability. Lastly, we adapt
these ideas to conventional CNNs and propose Augmented Cognitive CNN to achieve
superior performance in object recognition.
Comment: 5 Pages; LaTeX; Published at ICLR 2020 Workshop on Bridging AI and
Cognitive Scienc